Abstract: We exhibit a possible road towards a strong converse
for the quantum capacity of degradable channels. In particular,
we show that all degradable channels obey what we call a
"pretty strong" converse: When the code rate increases above
the quantum capacity, the fidelity makes a discontinuous jump
from 1 to at most $1/\sqrt{2}$ asymptotically. A similar result can be
shown for the private (classical) capacity.

Furthermore, we can show that if the strong converse holds 
for symmetric channels (which have quantum capacity zero), 
then degradable channels obey the strong converse: The above- 
mentioned asymptotic jump of the fidelity at the quantum 
capacity is then from 1 down to 0. 

Index Terms: quantum information, private classical information,
channel coding, strong converse, smooth entropies,
error-rate trade-off



I. Introduction 

Communication via noisy channels is one of the information 
processing tasks by which, following the fundamental work of 
Shannon [36], we have learned to quantify information and
noise. One of the most important models considered from 
these early days of information theory is that of a discrete 
memoryless channel, for which Shannon gave his famous 
single-letter formula for the capacity (i.e., the maximum 
communication rate achievable by asymptotically error-free 
block coding). 

The analogous model in quantum Shannon theory is the
memoryless quantum channel $\mathcal{N}^{\otimes n}$ (for asymptotically large
integer $n$), given by a completely positive and trace preserving
(cptp) map $\mathcal{N} : \mathcal{L}(A') \to \mathcal{L}(B)$, with Hilbert spaces $A'$ and $B$
that we assume to be finite dimensional throughout this paper.

The quantum capacity $Q(\mathcal{N})$ of $\mathcal{N}$ is informally defined
as the maximum rate at which quantum information can be
transmitted asymptotically faithfully over that channel, when
using it $n \to \infty$ times.

As for all channel capacity theorems, the quantum capacity
theorem consists of a direct part and a converse. The direct
part states that for rates below a certain threshold there exist
codes with decoding error (quantified as a certain distance
from noiseless transmission) tending to 0 in the number of
channel uses. The converse states that if the rate lies above
this threshold then the error does not go to 0 for any sequence
of codes. To be precise, this is known as a weak converse and
the threshold rate is sometimes called the weak capacity. A strong
converse is the statement that for rates above the capacity the
error converges to its maximum 1 as $n \to \infty$.

Strong converse theorems have been shown to hold for 
other types of information sent over memoryless quantum 
channels, including classical information encoded into product 
states [29], [50] and for general input states (i.e. allowing
the possibility of entangled input signal states) over certain
classes of quantum channels, by [25]. The strong converse
holds also for entanglement-assisted classical communication
over memoryless quantum channels, by the Quantum Reverse
Shannon Theorem [4], [6]. Strong converses do not hold by
default; certain quantum channels with memory have a weak 
capacity but fail the strong converse [|T3l . [[TSl . 

The paper is structured as follows: In Section II we recall
the definition of codes, error criteria and the quantum capacity.
Then, in Section III we discuss the weak converse for the
quantum capacity and the possibility of strong converses. In
Section IV we review the concept of degradable channels
and the analysis of Devetak and Shor [16] of their quantum
capacity. We will present the argument in a form that will aid
in the subsequent finer analysis, proving a structural lemma
on degradable channels along the way. Then in Section V
we state and prove our first main result (Theorem 2), strongly
bounding the rate of channels with sufficiently small error.
All necessary auxiliary results are stated in this section;
however, the proofs are relegated to the appendix. Subsequently,
we prove an analogous rate bound for the private classical
capacity (Theorem 14 in Section VI), and then show that
a strong converse for all symmetric channels implies the
strong converse for all degradable channels (Theorem 19
in Section VII). In Section VIII we discuss a semidefinite
programming approach to deal with the symmetric channels.
We conclude in Section IX with a brief discussion of what
was achieved and highlight open problems.

On notation: In this paper, $\log$ is always the binary logarithm,
and $\exp$ its inverse, the exponential function to base 2.
The natural logarithm is denoted $\ln x$, the natural exponential
function $e^x$.

II. Quantum channel capacity

For a given channel $\mathcal{N} : \mathcal{L}(A') \to \mathcal{L}(B)$, we consider
encoding and decoding of quantum information, given by
completely positive and trace preserving (cptp) maps

$$\mathcal{E} : \mathcal{L}(C) \to \mathcal{L}(A'), \qquad \mathcal{D} : \mathcal{L}(B) \to \mathcal{L}(C),$$

which together form a quantum code. The idea is that the
information to be sent is subjected to the overall effective
channel $\mathcal{D} \circ \mathcal{N} \circ \mathcal{E} : \mathcal{L}(C) \to \mathcal{L}(C)$. For a Hilbert space
$\mathcal{H}$, we denote by

$$S(\mathcal{H}) = \{\rho \geq 0 \text{ s.t. } \mathrm{Tr}\,\rho = 1\},$$
$$S_{\leq}(\mathcal{H}) = \{\rho \geq 0 \text{ s.t. } \mathrm{Tr}\,\rho \leq 1\},$$

the set of states and sub-normalized densities, respectively.

There are many ways of defining mathematically the notion
that the output is a good approximation of the input, and we
refer the reader to the comprehensive treatment of Kretschmann
and Werner [26] for a discussion of all the concomitant ways
of defining the capacity and the proof that asymptotically and
for vanishing error they are the same. In the present paper we
will measure the degree of approximation between states by
the fidelity, given as

$$F(\rho,\sigma) := \|\sqrt{\rho}\sqrt{\sigma}\|_1 = \max |\langle\phi|\psi\rangle|,$$

where the maximization is over all purifications $|\phi\rangle$, $|\psi\rangle$ of
$\rho$ and $\sigma$, respectively [24], [47]. This definition extends to
subnormalized density operators $\rho, \sigma \in S_{\leq}(\mathcal{H})$ by letting

$$F(\rho,\sigma) := F\bigl(\rho \oplus (1 - \mathrm{Tr}\,\rho),\ \sigma \oplus (1 - \mathrm{Tr}\,\sigma)\bigr)
= \|\sqrt{\rho}\sqrt{\sigma}\|_1 + \sqrt{(1 - \mathrm{Tr}\,\rho)(1 - \mathrm{Tr}\,\sigma)}.$$

It can be shown that both

$$P(\rho,\sigma) := \sqrt{1 - F(\rho,\sigma)^2}, \quad\text{and}$$

$$A(\rho,\sigma) := \arccos F(\rho,\sigma) = \arcsin P(\rho,\sigma),$$

called the purified distance and the geodesic distance, respectively,
are metrics on $S_{\leq}(\mathcal{H})$, cf. [45]. They are obviously
equivalent, and can be shown to be equivalent to the trace
norm distance [22]:

$$\frac{1}{2}\|\rho - \sigma\|_1 \leq P(\rho,\sigma) \leq \sqrt{\|\rho - \sigma\|_1}. \qquad (1)$$
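
As a quick numerical sanity check of Eq. (1), the following sketch evaluates the fidelity, the purified distance and the geodesic distance in the special case of commuting (diagonal) states, where the fidelity reduces to the classical Bhattacharyya coefficient. The restriction to diagonal states, the function names and the example distributions are illustrative assumptions and do not come from the text.

```python
import math

def fidelity(p, q):
    """Fidelity of two diagonal states given by their eigenvalue lists.

    For commuting states, ||sqrt(rho) sqrt(sigma)||_1 = sum_i sqrt(p_i q_i).
    """
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def purified_distance(p, q):
    f = min(1.0, fidelity(p, q))  # guard against rounding slightly above 1
    return math.sqrt(1.0 - f * f)

def geodesic_distance(p, q):
    return math.acos(min(1.0, fidelity(p, q)))

def trace_norm_distance(p, q):
    """Trace norm ||rho - sigma||_1 for diagonal states."""
    return sum(abs(a - b) for a, b in zip(p, q))

# Two hypothetical qutrit states, both diagonal in the same basis.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

P = purified_distance(p, q)
t = trace_norm_distance(p, q)
print("lower bound:", 0.5 * t, " P:", P, " upper bound:", math.sqrt(t))
print("A:", geodesic_distance(p, q), " arcsin P:", math.asin(P))
```

For non-commuting states the same check would need a matrix square root, which is outside the scope of this standard-library sketch.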

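In the subsequent definitions, we will consistently use the
purified distance. For instance, the error of a code $(\mathcal{E},\mathcal{D})$ for
$\mathcal{N}$ is defined as

$$P(\mathrm{id}, \mathcal{D} \circ \mathcal{N} \circ \mathcal{E}) := \sup \sup P\bigl(\rho, (\mathrm{id} \otimes \mathcal{D} \circ \mathcal{N} \circ \mathcal{E})\rho\bigr),$$

where the suprema run over reference systems and over input states
$\rho$ on reference and code space.

The maximum dimension $|C|$ of $C$ such that there exists a
quantum code for $\mathcal{N}^{\otimes n}$ with error $\epsilon$, is denoted $N(n,\epsilon)$, or
more precisely $N(n,\epsilon|\mathcal{N})$ if we want to refer explicitly to the
channel.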

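If we have a code with error $\leq \epsilon$, this means that we can
use it with the maximally entangled state $|\Phi\rangle$ at the input,
to get an output state

$$\sigma = (\mathrm{id} \otimes \mathcal{D} \circ \mathcal{N} \circ \mathcal{E})\Phi = (\mathrm{id} \otimes \mathcal{D} \circ \mathcal{N})(\mathrm{id} \otimes \mathcal{E})\Phi,$$

which is $\epsilon$-close to being maximally entangled: $P(\Phi,\sigma) \leq \epsilon$.
This motivates the definition of an entanglement-generating
code with error $\epsilon$, which consists of a state $\rho$ on $A'C$ and a
decoding cptp map $\mathcal{D} : \mathcal{L}(B) \to \mathcal{L}(C)$, such that

$$P\bigl(\Phi, (\mathrm{id} \otimes \mathcal{D} \circ \mathcal{N})\rho\bigr) \leq \epsilon.$$

The maximum dimension $|C|$ of $C$ such that there exists an
entanglement-generating code for $\mathcal{N}^{\otimes n}$ with error $\epsilon$, is denoted
$N_E(n,\epsilon)$, or more explicitly, $N_E(n,\epsilon|\mathcal{N})$. Clearly,
$N(n,\epsilon) \leq N_E(n,\epsilon)$.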

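Remark Since the purified distance
$P\bigl(\Phi, (\mathrm{id} \otimes \mathcal{D} \circ \mathcal{N})\rho\bigr) = \sqrt{1 - \mathrm{Tr}\,\Phi\,(\mathrm{id} \otimes \mathcal{D} \circ \mathcal{N})\rho}$
is concave in $\rho$, we may always
assume that the state $\rho$ on $A'C$ in an entanglement-generating
code is pure, as in each convex decomposition of $\rho$ there is
at least one state with an error no larger than that of $\rho$. ■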

The quantum capacity is now defined as

$$Q(\mathcal{N}) = \inf_{\epsilon > 0} \liminf_{n \to \infty} \frac{1}{n} \log N(n,\epsilon).$$

One obtains the same capacity when using $\limsup$ and $N_E$,
see [26] for a proof of this and the equivalence of other
variations of the definition.

A Shannon-style formula for the quantum capacity was first
stated by Lloyd [27] and proved rigorously by Shor [38] and
Devetak [15]. More precisely, in these papers they prove the
direct (achievability) part which together with the earlier result
of Schumacher and Nielsen [34], [35], who showed the same
quantity to be an upper bound (i.e., weak converse), leads to a
formula for the quantum capacity. We expand upon this weak
converse in the following section.

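The formula for the quantum capacity is given in terms of
the coherent information

$$I(A\rangle B)_\rho := -S(A|B)_\rho = S(\rho^{B}) - S(\rho^{AB}),$$

where $S(\rho) = -\mathrm{Tr}\,\rho \log \rho$ is the von Neumann entropy, of
a state $\rho^{AB} = (\mathrm{id} \otimes \mathcal{N})\phi^{AA'}$ with a "test state" $\phi$ on $AA'$.
Namely,

$$Q(\mathcal{N}) = \lim_{n \to \infty} \frac{1}{n} Q^{(1)}(\mathcal{N}^{\otimes n}),$$

with the single-letter expression

$$Q^{(1)}(\mathcal{N}) = \max_{\phi \in S(AA')} \{ I(A\rangle B)_\rho : \rho = (\mathrm{id} \otimes \mathcal{N})\phi \}.$$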

Remark The quantum capacity is known to be non-additive 
[44] . So is the single-letter quantity Q'-'^^N) HI], igOl, mean- 
ing that the regularization above is necessary, at least as long 
as we base our capacity formula on the coherent information. 
It is not known whether there is a single-letter formula for 
$Q(\mathcal{N})$, or even an efficient approximation scheme [39]. As
a matter of fact, we do not even know how to characterize 
the quantum capacity of the qubit depolarizing channel as a 
function of the noise. ■ 

III. Weak and strong converse

The fact that the coherent information gives an upper
bound on the quantum capacity of general channels has been
known since Schumacher and Nielsen [34]. They showed
that for any entanglement generating code with code space
$C$ for a channel $\mathcal{N} : \mathcal{L}(A') \to \mathcal{L}(B)$ with error $\epsilon$, using
strong subadditivity together with Eq. (1) and the Fannes
inequality, there exists an input test state $\phi^{AA'}$ such that with
$\rho^{AB} = (\mathrm{id} \otimes \mathcal{N})\phi$,

$$(1 - 2\epsilon) \log |C| \leq I(A\rangle B)_\rho + 1.$$

Applying this to a maximal code for $\mathcal{N}^{\otimes n}$ yields, for $\epsilon < \frac{1}{2}$,

$$\frac{1}{n} \log N_E(n,\epsilon) \leq \frac{1}{1 - 2\epsilon}\,\frac{1}{n}\, Q^{(1)}(\mathcal{N}^{\otimes n}) + \frac{1}{(1 - 2\epsilon)n}, \qquad (2)$$




hence the result that for $n \to \infty$ and $\epsilon \to 0$, the optimal
rate cannot exceed $\lim_n \frac{1}{n} Q^{(1)}(\mathcal{N}^{\otimes n})$, which we know is also
asymptotically achievable, thanks to Lloyd-Shor-Devetak.

However, for any non-zero $\epsilon > 0$, the upper bound in Eq. (2)
is a constant factor away from the capacity, which is the
hallmark of a weak converse; it leaves room for a trade-off
between communication rate and error, asymptotically.
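
To make the "constant factor" concrete, the sketch below evaluates the right-hand side of the weak-converse bound, read here as $\bigl(\frac{1}{n}Q^{(1)}(\mathcal{N}^{\otimes n}) + \frac{1}{n}\bigr)/(1 - 2\epsilon)$. The function name, the parameter `q1_per_use` and the numerical values are hypothetical and serve only to illustrate how the bound stays bounded away from the capacity for fixed $\epsilon > 0$.

```python
def weak_converse_rate_bound(q1_per_use, eps, n):
    """Upper bound on (1/n) log N_E(n, eps) from the weak converse.

    q1_per_use stands for (1/n) Q^(1) of n channel uses, in qubits per use.
    """
    if not 0.0 <= eps < 0.5:
        raise ValueError("the bound is only meaningful for 0 <= eps < 1/2")
    return (q1_per_use + 1.0 / n) / (1.0 - 2.0 * eps)

# Hypothetical channel with 0.5 qubits per use and error eps = 0.1:
# the bound tends to 0.5 / 0.8 = 0.625 rather than to 0.5.
for n in (10, 1000, 100000):
    print(n, weak_converse_rate_bound(0.5, 0.1, n))
```

Only in the limit $\epsilon \to 0$ does the factor $1/(1 - 2\epsilon)$ disappear, which is why this argument alone cannot exclude a rate-error trade-off.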

If the quantum capacity $Q(\mathcal{N})$ is zero, Eq. (2) says something
a bit stronger, namely that $N_E(n,\epsilon) \leq O(1)$, at least
when $\epsilon < \frac{1}{2}$. In this article we call such a statement a pretty
strong converse, i.e. a proof amounting to

$$\limsup_{n \to \infty} \frac{1}{n} \log N_E(n,\epsilon) \leq Q(\mathcal{N}),$$

at least for error $\epsilon$ below some threshold $\epsilon_0$. A strong converse
would require the above for all $\epsilon < 1$.

Here are two simple examples of channels for which the 
strong converse holds. 

Example (PPT entanglement binding channels). If $\mathcal{N}$ is such
that all $\rho = (\mathrm{id} \otimes \mathcal{N})\phi$ have positive partial transpose (PPT),
then the output state of any entanglement generating code for a
maximally entangled state of Schmidt rank $d$, denoted $\Phi_d$, using
any number $n$ of channel uses and even arbitrary classical
communication on the side, is still a PPT state. Twirling by the
symmetries $U \otimes \overline{U}$ of the maximally entangled state does not
change the fidelity between the resulting state and the maximally
entangled state. But the resulting isotropic state

$$p\,\Phi_d + (1 - p)\,\frac{1}{d^2 - 1}\,(\mathbb{1} - \Phi_d)$$

is still PPT, and it is well-known that this can only hold for
$p \leq \frac{1}{d}$ [30]. I.e., the error is at least $\sqrt{1 - \frac{1}{d}}$, which in the
setting of $n$ channel uses ($\mathcal{N}^{\otimes n}$) goes to 1 exponentially fast
for positive rates (meaning $d = 2^{nR}$ for $R > 0$). ■

Example (Ideal channel). Consider the identity $\mathrm{id}_2 :
\mathcal{L}(\mathbb{C}^2) \to \mathcal{L}(\mathbb{C}^2)$ on a qubit and an entanglement-generating
code for $n$ uses of it, $\mathrm{id}_2^{\otimes n}$, for a maximally entangled state of
rank $d$. It is evident that the state shared between sender and
receiver after the transmission is of Schmidt rank $\leq 2^n$, and
so is any state obtained by the receiver's decoding. Hence the
fidelity of the code is upper bounded by

$$\max \{ |\langle\Phi_d|\psi\rangle| : \text{Schmidt rank of } \psi \text{ at most } 2^n \} = \sqrt{2^n/d}.$$

Consequently, as soon as the rate is above the capacity
$Q(\mathrm{id}_2) = 1$, i.e. $d = 2^{nR}$ for $R > 1$, the error goes to 1
exponentially fast. ■
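
The two examples can be turned into explicit error lower bounds. The sketch below assumes the fidelity bounds $F \leq \sqrt{1/d}$ (PPT case) and $F \leq \sqrt{2^n/d}$ (ideal qubit channel), together with $P = \sqrt{1 - F^2}$ and $d = 2^{nR}$; the function names are hypothetical.

```python
import math

def ppt_error_lower_bound(n, rate):
    """Error lower bound sqrt(1 - 1/d) for a PPT channel, with d = 2^(n*rate)."""
    return math.sqrt(1.0 - 2.0 ** (-n * rate))

def ideal_channel_error_lower_bound(n, rate):
    """Error lower bound sqrt(1 - 2^n/d) for n ideal qubit channels.

    The squared fidelity is at most min(1, 2^n / d) with d = 2^(n*rate),
    so the bound is trivial (zero) for rates up to the capacity 1.
    """
    f_squared = min(1.0, 2.0 ** (n * (1.0 - rate)))
    return math.sqrt(1.0 - f_squared)

# Above capacity (rate 1.1 > 1) the error bound approaches 1 with growing n.
for n in (10, 50, 200):
    print(n, ideal_channel_error_lower_bound(n, 1.1), ppt_error_lower_bound(n, 0.1))
```

Since $1 - P^2$ decays like $2^{-n(R-1)}$ in the ideal-channel case, the convergence of the error to 1 is exponential in $n$, as the example states.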

IV. Degradable and anti-degradable channels 

By the Stinespring dilation theorem, any channel can be
defined by an isometric embedding $U : A' \to B \otimes E$
followed by a partial trace over the environment system
$E$, such that $\mathcal{N}(\rho) = \mathrm{Tr}_E\, U \rho U^\dagger$. Tracing over $B$ rather
than $E$ we obtain the corresponding complementary channel,
$\mathcal{N}^c(\rho) := \mathrm{Tr}_B\, U \rho U^\dagger$.

As we are interested in the channel's behaviour, we will
without loss of generality assume from now on that $E$ is chosen
to be of minimal dimension (which makes $U$ unique up to
isometries on $E$). Furthermore, since $\mathcal{N}$ is the complementary
channel of $\mathcal{N}^c$, we may equally reduce the dimension of $B$
if necessary; this can equivalently be described as finding the
subspace $\widehat{B} \subset B$ that contains all supports of all $\mathcal{N}(\rho)$ for
states $\rho$ on $A'$, which is in fact the supporting subspace of
$\mathcal{N}(\mathbb{1})$, and viewing $\mathcal{N}$ as a mapping into $\mathcal{L}(\widehat{B})$.

A channel $\mathcal{N}$ is called degradable if it can be degraded to its
complementary channel, i.e. if there exists a cptp map $\mathcal{M}$ such
that $\mathcal{N}^c = \mathcal{M} \circ \mathcal{N}$. Introducing the Stinespring dilation of $\mathcal{M}$
by an isometry $V : B \to F \otimes E'$, the channel output system
$B$ can be mapped to the composite system $E' \otimes F$ such that
the channel taking $A'$ to $E$ is the same as the channel taking
$A'$ to $E'$ (with an isomorphism between $E$ and $E'$ fixed once
and for all). We may also assume $F$ to be minimal. The above
information process is illustrated in Fig. 1.

If the complementary channel is degradable, i.e. if $\mathcal{N} =
\mathcal{M} \circ \mathcal{N}^c$ for some cptp map $\mathcal{M}$, we call $\mathcal{N}$ anti-degradable. A
channel that is both degradable and anti-degradable is called
symmetric [41].
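
A standard illustration of these notions is the qubit erasure channel, which is introduced here as an assumption for the sake of an example: with erasure probability $p$ its complementary channel is an erasure channel with probability $1 - p$, so it is degradable for $p \leq 1/2$, anti-degradable for $p \geq 1/2$ and symmetric at $p = 1/2$. The sketch computes its coherent information $S(B) - S(E)$ for a maximally entangled test state; the function names are hypothetical.

```python
import math

def binary_entropy(p):
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def erasure_coherent_information(p):
    """Coherent information of a qubit erasure channel, erasure probability p.

    With a maximally entangled input, B holds the maximally mixed qubit
    with probability 1 - p and an erasure flag otherwise; the environment E
    holds the qubit with probability p. Since the state on ABE is pure,
    S(AB) = S(E), hence I(A>B) = S(B) - S(E).
    """
    entropy_b = binary_entropy(p) + (1.0 - p)
    entropy_e = binary_entropy(p) + p
    return entropy_b - entropy_e

for p in (0.0, 0.25, 0.5):
    print(p, erasure_coherent_information(p))
```

The value $1 - 2p$ vanishes at $p = 1/2$, in line with the statement that symmetric channels have quantum capacity zero.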




Fig. 1. Schematic of a degradable quantum channel, with the input state $\phi$
between $A'$ and the reference $A$, the channel output and environment state
on $ABE$, and the state $\psi$ shared between $A$, $F$ and the two copies of the
original environment, $E$ and $E'$.



The identity between the channels $\mathcal{L}(A') \to \mathcal{L}(E)$ and
$\mathcal{L}(A') \to \mathcal{L}(E')$ (defined by conjugating by $VU$ and tracing
over $E'F$ and $EF$, respectively) is expressed by the equation

$$\psi^{AE} = \psi^{AE'}, \qquad (3)$$

modulo the implicit isomorphism between $E$ and $E'$. This was
enough for Devetak and Shor [16] to prove that for degradable
channels the coherent information is additive; see also [11,
Sec. A.2]. The crucial point in their argument is that the
coherent information can be rewritten as a conditional entropy,

$$I(A\rangle B)_\rho = S(F|E')_\psi. \qquad (4)$$

Then, based on the observation that the state $\psi^{FE'}$ on the
r.h.s. is a linear function of the reference state $\phi^{A'} = \mathrm{Tr}_A \phi$,
and using strong subadditivity, one gets subadditivity of the
coherent information of a product channel, hence additivity of
$Q^{(1)}$. Below we give an alternative account of the reasoning
leading to Eq. (4), which while being more complicated than
those cited, has the benefit of suggesting an extension to
min-entropies (Section V). For the class of degradable channels
it is also known that the quantum capacity equals the private
capacity [43]; see Section VI below.

Denoting $\mathrm{SWAP}_{EE'}$ the swap unitary between systems $E$
and $E'$, i.e. $\mathrm{SWAP}|u\rangle|v\rangle = |v\rangle|u\rangle$ (always modulo the implicit
identification of $E$ with $E'$), we have the following statement
strengthening Eq. (3):

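Lemma 1 Consider a degradable channel $\mathcal{N}$ with Stinespring
dilation $U : A' \to B \otimes E$. Then there exists a degrading map
$\mathcal{M}$ with Stinespring dilation $V : B \to F \otimes E'$ (not necessarily
with minimal dimension $|F|$) and a unitary $X$ on $F$, which
may be chosen as an involution (i.e. $X^2 = \mathbb{1}$), such that

$$(X^F \otimes \mathrm{SWAP}_{EE'})\,VU = VU.$$

In particular, for arbitrary state vector $|\phi\rangle^{AA'}$ and
$|\psi\rangle^{AFEE'} = (\mathbb{1} \otimes VU)|\phi\rangle$,

$$(\mathbb{1}^A \otimes X^F \otimes \mathrm{SWAP}_{EE'})\,|\psi\rangle^{AFEE'} = |\psi\rangle^{AFEE'}.$$
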

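Proof: Start with an arbitrary dilation $V_0 : B \to F_0 \otimes E'$
of an arbitrary degrading map $\mathcal{M}_0$, and define the following isometry

$$W : A' \to F_0 \otimes E \otimes E' \otimes G,$$

$$W := \frac{1}{\sqrt{2}} \bigl( V_0 U \otimes |0\rangle^G + \mathrm{SWAP}_{EE'} V_0 U \otimes |1\rangle^G \bigr),$$

with a qubit system $G$. Let $F = F_0 \otimes G$ and $X^F = \mathbb{1}^{F_0} \otimes X^G$,
where $X$ is the Pauli $\sigma_x$ unitary on $G$. Evidently,

$$W = (\mathrm{SWAP}_{EE'} \otimes X^F)\,W,$$

and also, since $\mathcal{N}$ is degradable,

$$\mathrm{Tr}_{E'F}\, W \rho W^\dagger = \mathcal{N}^c(\rho).$$

Hence, the Stinespring dilations $U$ and $W$ are equivalent; to
be precise, there exists an isometry $V : B \to E'F$ such that
$W = VU$, and we get $VU = (\mathrm{SWAP}_{EE'} \otimes X^F)\,VU$. ■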

The following reasoning uses the chain rule identity
$S(AB|C) = S(B|C) + S(A|BC)$ of the conditional von
Neumann entropy, but no explicit expansion of any conditional
entropy as a difference of two entropies. Consider a generic
input state $\phi^{AA'}$ to $\mathcal{N}$ and its associated states on $ABE$ and
$AFEE'$, the latter being $\psi^{AFEE'}$.
Now, by invariance of the conditional entropy $S(A|B) =
S(AB) - S(B)$ under local unitaries and the duality identity
$S(A|B) = -S(A|C)$ with respect to a pure state on $ABC$,
combined with the above lemma,

$$\begin{aligned}
I(A\rangle B)_\rho &= -S(A|B) \\
&= S(F|E') - S(AF|E') \\
&= S(F|E') + S(AF|E) \\
&= S(F|E') + S(AF|E').
\end{aligned}$$

This shows that $S(AF|E) = S(AF|E') = 0$, and we obtain Eq. (4).

V. Pretty strong converse 

Theorem 2 Let $\mathcal{N} : \mathcal{L}(A) \to \mathcal{L}(B)$ be a degradable channel
with finite quantum systems $A$ and $B$. Then, there exists a
constant $\mu$ such that for error $\epsilon < \frac{1}{\sqrt{2}}$ and every integer $n$,

$$\log N(n,\epsilon) \leq \log N_E(n,\epsilon)$$



< nQ^^\U) + p\ nln 



64711-41" 



3|A|2logn + 5 + 51og-, 



where X = \ - e 

Together with the direct part (achievability proved in [15],
[27], [38]) we thus get:



Corollary 3 For a degradable channel $\mathcal{N}$, the quantum
capacity is given by

$$Q(\mathcal{N}) = \lim_{n \to \infty} \frac{1}{n} \log N(n,\epsilon) = \lim_{n \to \infty} \frac{1}{n} \log N_E(n,\epsilon),$$

for any $0 < \epsilon < \frac{1}{\sqrt{2}}$. Compared to the original definition this is
simpler as we do not need to vary $\epsilon$, and there is convergence
rather than reference to lim inf or lim sup. ■

The proof of this theorem will rely on the calculus of
min- and max-entropies, of which we will briefly review the
necessary definitions and properties; we refer the reader to [45]
for more details.

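Definition 4 (Min- and max-entropy) For $\rho^{AB} \in S_{\leq}(AB)$,
the min-entropy of $A$ conditioned on $B$ is defined as

$$H_{\min}(A|B)_\rho := \max_{\sigma^B \in S(B)} \max \{ \lambda \in \mathbb{R} : \rho^{AB} \leq 2^{-\lambda}\, \mathbb{1} \otimes \sigma^B \}.$$

With a purification $|\psi\rangle^{ABC}$ of $\rho$, we define

$$H_{\max}(A|B)_\rho := -H_{\min}(A|C)_{\psi^{AC}},$$

with the reduced state $\psi^{AC} = \mathrm{Tr}_B\, \psi$.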
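
For intuition, consider the special case where $\rho^{AB}$ is diagonal in a product basis, i.e. a classical joint distribution $p(a,b)$; this restriction is an assumption made only for this example. The operator inequality in Definition 4 then reads $p(a,b) \leq 2^{-\lambda}\sigma(b)$ for all $a, b$, and optimizing over $\sigma$ gives $H_{\min}(A|B) = -\log \sum_b \max_a p(a,b)$, the negative logarithm of the optimal guessing probability. The function name is hypothetical.

```python
import math

def classical_min_entropy(joint):
    """H_min(A|B) in bits for a classical distribution joint[a][b] = p(a, b)."""
    num_b = len(joint[0])
    # Optimal probability of guessing A after observing B.
    p_guess = sum(max(row[b] for row in joint) for b in range(num_b))
    return -math.log2(p_guess)

perfectly_correlated = [[0.5, 0.0],
                        [0.0, 0.5]]
independent_uniform = [[0.25, 0.25],
                       [0.25, 0.25]]

print(classical_min_entropy(perfectly_correlated))
print(classical_min_entropy(independent_uniform))
```

Classical distributions never give a negative value, whereas for a maximally entangled state the min-entropy equals $-\log|A|$, which is the property exploited in the proof of Theorem 2.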

Definition 5 (Smooth min- and max-entropy) Let $\epsilon \geq 0$
and $\rho^{AB} \in S(AB)$. The $\epsilon$-smooth min-entropy of $A$
conditioned on $B$ is defined as

$$H^{\epsilon}_{\min}(A|B)_\rho := \max_{\rho' \approx_\epsilon \rho} H_{\min}(A|B)_{\rho'},$$

where $\rho' \approx_\epsilon \rho$ means $P(\rho',\rho) \leq \epsilon$ for $\rho' \in S_{\leq}(AB)$.
Similarly,

$$H^{\epsilon}_{\max}(A|B)_\rho := \min_{\rho' \approx_\epsilon \rho} H_{\max}(A|B)_{\rho'} = -H^{\epsilon}_{\min}(A|C)_\psi,$$

with a purification $\psi \in S(ABC)$ of $\rho$.

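All min- and max-entropies, smoothed or not, are invariant
under local unitaries and local isometries.

Lemma 6 (Monotonicity) For a state $\rho \in S(ABC)$ and any
$\epsilon \geq 0$,

$$H^{\epsilon}_{\min}(A|BC) \leq H^{\epsilon}_{\min}(A|B), \qquad H^{\epsilon}_{\max}(A|BC) \leq H^{\epsilon}_{\max}(A|B).$$

Since every cptp map can be written as an isometry followed
by a partial trace, this means that for every $\rho \in S(AB)$ and
cptp map $\mathcal{T} : \mathcal{L}(B) \to \mathcal{L}(C)$,

$$H^{\epsilon}_{\min}(A|B)_\rho \leq H^{\epsilon}_{\min}(A|C)_{(\mathrm{id} \otimes \mathcal{T})\rho},$$
$$H^{\epsilon}_{\max}(A|B)_\rho \leq H^{\epsilon}_{\max}(A|C)_{(\mathrm{id} \otimes \mathcal{T})\rho}.$$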



The following relations generalize the well-known chain 
rule identity S{AB\C) = S{B\C) + S{A\BC) for the von 
Neumann entropy, albeit for min- and max-entropies it turns 
into one of a set of inequalities. There are eight versions of 
it P8l . of which we cite only the two we are going to use. 



Lemma 7 (Chain rules flTl, g^) Let e,S > 0, r] > 0. 

Then, with respect to the same state p G S{ABC), 

Ktl'+\AB\C) < Hl,^{B\C) + K,MBC) 

^1 2 (5) 
+ log— , 



and 



Q1 2 (6) 
31og— . 

77^ 



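Lemma 8 (Proposition 5.5 in [45]) Let $\rho \in S(AB)$ and
$\alpha, \beta \geq 0$ such that $\alpha + \beta < \frac{\pi}{2}$. Then,

$$H^{\sin\alpha}_{\min}(A|B)_\rho \leq H^{\sin\beta}_{\max}(A|B)_\rho + \log \frac{1}{\cos^2(\alpha + \beta)}. \qquad (7)$$

For $\epsilon, \delta \geq 0$, $\epsilon + \delta < 1$ this can be relaxed to the simpler form

$$H^{\epsilon}_{\min}(A|B)_\rho \leq H^{\delta}_{\max}(A|B)_\rho + \log \frac{1}{1 - (\epsilon + \delta)^2}. \qquad (8)$$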
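
The step from the angle form to the relaxed form rests on $\sin(\alpha + \beta) \leq \sin\alpha + \sin\beta$. The sketch below compares the two penalty terms numerically, assuming they read $\log\frac{1}{\cos^2(\alpha+\beta)}$ and $\log\frac{1}{1-(\epsilon+\delta)^2}$ with $\epsilon = \sin\alpha$, $\delta = \sin\beta$; the function names and sample values are hypothetical.

```python
import math

def penalty_angle_form(eps, delta):
    """log 1/cos^2(alpha + beta) with sin(alpha) = eps, sin(beta) = delta."""
    alpha, beta = math.asin(eps), math.asin(delta)
    if alpha + beta >= math.pi / 2:
        raise ValueError("requires alpha + beta < pi/2")
    return math.log2(1.0 / math.cos(alpha + beta) ** 2)

def penalty_relaxed_form(eps, delta):
    """log 1/(1 - (eps + delta)^2), valid for eps + delta < 1."""
    if eps + delta >= 1.0:
        raise ValueError("requires eps + delta < 1")
    return math.log2(1.0 / (1.0 - (eps + delta) ** 2))

# The relaxed penalty is never smaller than the angle-form penalty.
for eps, delta in ((0.3, 0.2), (0.1, 0.1), (0.5, 0.4)):
    print(eps, delta, penalty_angle_form(eps, delta), penalty_relaxed_form(eps, delta))
```

In the symmetric case $\alpha = \beta$ the angle form becomes $\log\frac{1}{\cos^2(2\alpha)}$ with $\cos(2\alpha) = 1 - 2\sin^2\alpha$, which is the shape of the term appearing in the proof of Theorem 2.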



Lemma 9 (Dupuis [20]) Let $\rho \in S(AB)$ and $0 < \epsilon < 1$.
Then, 

/Z 

(9) 



which can be rewritten and relaxed into the form 

HiMB)p < H^{A\B), 



(10) 



forO<S <1. 



Proof of Theorem 2: Consider an entanglement generation
code for $\log N_E(n,\epsilon)$ ebits of error $\epsilon$ for the channel
$\mathcal{N}^{\otimes n}$. As observed in conjunction with the definitions,
$N(n,\epsilon) \leq N_E(n,\epsilon)$ and w.l.o.g. the input state $\phi$ to the
entanglement-generating code is pure (see Remark in Section
II). Similar to Fig. 1 write

l^^lB'-F-B" _ (10(1/0 ]1)»«)|<^). 

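By definition, there exists a decoding cptp map $\mathcal{D} :
\mathcal{L}(B^n) \to \mathcal{L}(A')$, such that $\sigma = (\mathrm{id} \otimes \mathcal{D} \circ \mathcal{N}^{\otimes n})\phi$ has purified
distance $\leq \epsilon$ from the maximally entangled state $\Phi^{AA'}$. Note
that $|A| = |A'| = N_E(n,\epsilon)$. Hence, by definition of the
max-entropy and using its monotonicity under cptp maps
(Lemma 6),

$$\begin{aligned}
\log N_E(n,\epsilon) &\leq -H^{\epsilon}_{\max}(A|A')_\sigma \\
&\leq -H^{\epsilon}_{\max}(A|B^n)_\psi \\
&= -H^{\epsilon}_{\max}(A|E'^n F^n)_\psi.
\end{aligned}$$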

The latter, by the duality relation (Definition 5), is equal to
$H^{\epsilon}_{\min}(A|E^n)$, which relates the coding performance directly to
the decoupling principle (cf. [19]). But we shall not use that
route and instead invoke the chain rule [Lemma 7, Eq. (5)],
with r] — X ^ J (^-^ ~ ^° continue 

logNE{n,e)<H,i,^{F-\E"') 

-Ktl\AF-\En+\og^. ^^^^ 

Let us deal with the second term here first: Using duality, and 
invoking Lemma[8l Eq. O with a = P = arcsin(e + 3A) < j, 
we get 

-H^+l\AF-\E''') = ij^ir (li^-i^;") 

<HZ:{AF-\E-)+ log- ^ 



i7-L"(^i^"|i?"') + 21og 



cos2(2q;) 
1 



cos(2a) ' 

using the symmetry of $\psi$ with respect to swapping $E^n$ and
$E'^n$, as expressed in Lemma 1. We find that

-i7f+f(lF"|i?'")<log- ^ 



loe 



l-2(e + 3A)2 
1 



2 



1 (12) 



H:^{A\B)p < h:^JA\B)p, 



Turning to the first term in Eq. (11), we note that it
is evaluated on $\psi^{F^n E'^n} = V^{\otimes n} \mathcal{N}^{\otimes n}(\rho^{(n)}) V^{\dagger \otimes n}$, a linear
function of the input density $\rho^{(n)} = \mathrm{Tr}_A\, \phi \in S(A'^n)$. By
slight abuse of notation we henceforth write

Now, if we knew that the maximum of this max-entropy is
attained on a tensor power state $\rho^{(n)} = \rho^{\otimes n}$, then we would be
done, by immediately applying the asymptotic equipartition
property (AEP) for min- and max-entropies (Proposition 13).
A priori, however, the state $\rho^{(n)}$ is arbitrary (note that it
eventually comes directly from the optimal code with which
we started our reasoning), so we need to work a little more.
To this end we shall exploit the permutation covariance of the
channel; for any permutation $\pi \in S_n$, acting naturally on an
$n$-partite system, we have



and since n^-^^ ^ 



and by the local unitary 



invariance of the min- and max-entropies, we get 



At this point we can use a restricted concavity property of the
max-entropy, Lemma 10 below, and get



for the permutation invariant state 



{F-\E'\ 



(13) 



7reS„ 



where we have also invoked Lemma 9, Eq. (10), in the second
inequality in (13).

It is well-known that such permutation-invariant states are,
in several meaningful senses, approximated by convex
combinations of tensor power states; such a statement is known as
(finite) de Finetti theorem, and here we use it in the form of
the Post-Selection Lemma [10] (Lemma 12 below).

where on the right we have the universal de Finetti state 



for a certain universal measure on states a e S{A). Without 
loss of generality, by Caratheodory's Theorem, it may be 
assumed to be supported on M < n^^^^ points, hence we 
may write 



M 
i=l 



Now we claim that 



Indeed, let p' be such that < l-(5:= l-iA^.Le., 

by the post-selection inequality and the operator monotonicity 
of the square root. 



n' 7^(")^ - 



Vl-(l-<5)2<F(p',p-; = 




We point out that it is also possible to do this using Renner's Exponential 
de Finetti Theorem [33], which requires a little more care to employ, but
yields bounds quite similar to the ones obtained in the following. 



> v/i-(l-^')^ 

with S' = i(5r^"l^l^ Hence, from Eqs. (O and (O, 
Lemma [8] Eq. (O, and Lemma [TT| below (with the finite- 
support decomposition of o;'"^), 

pes{A) 

+ Z\A\^\ogn + Q + \og^. 



(15) 



Putting Eqs. (11), (12) and (15) together, we arrive at



\ogNE{n,e) < max i/Af (F"|£;'")^s 



pes{A) 



3|ylplogn + 5 + 51og 



A' 



Note that the optimization over $\rho$ is indeed a maximum
since the smooth max-entropy is a continuous function of the
state. The last step of the proof is an appeal to the quantum
asymptotic equipartition property (Proposition 13),



an 



and we are done. 



Remark The error $\frac{1}{\sqrt{2}}$ is precisely that achieved asymptotically
by a single 50%-50% erasure channel acting on the
code space, and of other suitable symmetric (i.e., degradable
and anti-degradable) channels. We draw attention to the fact
that in the proof we encounter a symmetric state, up to a local
unitary, $\psi^{A F^n E^n E'^n}$, which can indeed be interpreted as the
joint state between input ($AF^n$), output ($E^n$) and environment
($E'^n$) of a suitable test state with a symmetric channel's
Stinespring dilation.

We need to bound its min-entropy, $H^{\epsilon+3\lambda}_{\min}(AF^n|E^n)$, but if
$\epsilon > \frac{1}{\sqrt{2}}$ then the overall smoothing parameter is strictly larger,
and without any additional structure of the state we cannot
upper bound the quantity further: Note that the symmetry we
were using is consistent with an arbitrarily large entangled
state passing through a single 50%-50% erasure channel. The
smoothing by more than $\frac{1}{\sqrt{2}}$ allows us to get rid of the erasure
and pick out the successful transmission, yielding an arbitrarily
large smooth min-entropy.

However, in Sections VII and VIII we will discuss other
potential approaches, which might work because they use all
the available structure. ■

Here are the lemmas needed in the above proof; they are 
proved in the appendix. 

It is known that the max-entropy $H_{\max}(A|B)_\rho$ is concave
in the state $\rho^{AB}$ [46], but this does not extend to the smoothed
version. However, the following statement holds.

Lemma 10 Let $\rho \in S(AB)$ be a state and consider the state
family $\rho_i = (U_i \otimes V_i)\rho(U_i \otimes V_i)^\dagger$, with unitaries $U_i$ on $A$
and $V_i$ on $B$, and probabilities $p_i$; define $\overline{\rho} := \sum_i p_i \rho_i$. Then,

Lemma 11 For an ensemble $\{p_i, \rho_i\}_{i=1}^{M}$ of states $\rho_i \in
S(AB)$ with probabilities $p_i$, let $\overline{\rho} = \sum_i p_i \rho_i$. Then, for any
$0 < \epsilon < 1$,

$$H^{\epsilon}_{\max}(A|B)_{\overline{\rho}} \leq \max_i H^{\epsilon}_{\max}(A|B)_{\rho_i} + \log M.$$

Lemma 12 (Post-Selection Technique |10|) For a Hilbert 
space % of dimension d, denote by Syni"('H) the subspace of 
permutation-invariant states in li®^. Then, for every state p 
supported on Sym"('H), 

Pl^n-" [ dV'|^)(VP"=Psym"(«)' 



with the uniform (i.e., unitarily invariant) probability measure 
d?/' on pure states of %, and - by Schur's Lemma - the 
projector Psym"('H) onto the symmetric subspace. 

If p is a state on 'H'*" invariant under conjugation by 
permutations, p = 7rp7r^ for all tt G Sn, then the above can 
be applied to its purification in Sym"('H ® %'), giving 

p<n''' J dacr*^", 

with a universal probability measure da on S^H). ■ 

Finally, we state a simplified version of the asymptotic 
equipartition property for min- and max-entropies, giving 
useful bounds for every n: 

Proposition 13 (Min- and max-entropy AEP |l32l, HSi ) 

Let p e 3(1-1 ab) cind < e < 1. Then, 



1 



1 



lim ^ lim 

= S{A\B)p. 

More precisely, for a purification $|\psi\rangle \in ABC$ of $\rho$, denote
$\mu_X := \log \|(\psi^X)^{-1}\|$, where the inverse is the generalized
inverse (restricted to the support), for $X = B, C$. Then, for
every $n$,



H'^,^{A^\B^) > nSiA\B) - (/is + Pc)\/nln -, (16) 



H:^^^{A^\B^) < nS{A\B) + (pB + Pc)\l n\n-, (17) 
and similar opposite bounds via Lemma 8.

VI. Pretty strong converse for the private capacity

In this section we show that the argument in the previous
section can be augmented to yield a pretty strong converse for
the private capacity.

We start by reviewing the basic definitions, which we adapt
from Renes and Renner [31]: A private classical code for
a channel $\mathcal{N} : \mathcal{L}(A') \to \mathcal{L}(B)$ consists of a family of
signal states $\rho_x \in S(A')$ ($x = 1, \ldots, M$), and a decoding
measurement (POVM) $(D_x)_{x=1}^{M}$, i.e. $D_x \geq 0$, $\sum_x D_x = \mathbb{1}_B$.
The latter can also be viewed as a cptp map $\mathcal{D} : \mathcal{L}(B) \to \widehat{X}$.
Postulating a uniform distribution on the messages $x$, the code
gives rise to the following averaged ccq-state of input, output
and environment:



XXE 



M 



encoding all correlations between legal users and eavesdropper 
of the system. The error of the code is defined in terms of the 
purified distance as 



X XX 



Its privacy is defined as 



min P . 



-y 



x\x\^ ®p^ 



mm A 

pes{E) \ 



1- (]gE^(-^^(/^=^)"^'')) 



For a given channel $\mathcal{N}$, we denote the largest $M$ such that
there exists a private classical code with error $\epsilon$ and privacy
$\delta$, by $M(n,\epsilon,\delta)$. The (weak) private capacity of $\mathcal{N}$ is then
defined as

$$P(\mathcal{N}) = \inf_{\epsilon,\delta > 0} \liminf_{n \to \infty} \frac{1}{n} \log M(n,\epsilon,\delta).$$

It was determined in [9], [15], and like $Q$ it is only known as a
regularized characterization in general [42]. By the monogamy
of entanglement, we know that $P(\mathcal{N}) \geq Q(\mathcal{N})$ (see the
Remark below), but in general this inequality is strict.

However for degradable channels, it was proved by
Smith [43] that the private capacity $P(\mathcal{N})$ equals the quantum
capacity $Q(\mathcal{N}) = Q^{(1)}(\mathcal{N})$, and is hence given by a simple
single-letter formula.

Remark The way we defined the code and the error above
(as an average) is really that of a secret key generation
code, analogous to the entanglement-generating codes in the
previous section. To go from averaged error and privacy to
essentially the same worst-case notions at the expense of
losing a constant fraction of the messages (hence no rate
loss asymptotically) we use Ahlswede's observation [2] on
how randomization in the encoding can turn several average
errors into only slightly worse worst-case errors.

For a code with messages x = 1, . . . , Af and joint cq-state 
after decoding. 



ABE 



xy 



®Pxy 
consider the reduced states



xy 



X 

With error and privacy defined as above,

e = pl^p^^.ljY.\^){x\®\x){x\^ 

and 

where P = Vl — is the purified distance, a short calcula- 
tion shows that 

\ 2; / X 

\ X / X 

We will now encode messages $m$ into uniform distributions on
pairwise disjoint sets $K_m \subset [M] = \{1, \ldots, M\}$ of cardinality
$k$, with $m = 1, \ldots, N$ such that $kN \leq M$.

We will draw the elements of $K_1, \ldots, K_N$ randomly and
without replacement from $[M]$. We then use Azuma's inequality
to bound the probability that for a given $m$ and $\eta > 0$



k 



xGK„ 



or 



xeK„ 



Namely, each of these events has probability at most p — 
2g-2fcr) ijjj^ |i4|. The input-output-environment state of the 
new code for the messages to = 1, . . . , is 



= 4^1 E P{y\x)\m){m\ |TO')(m'| ^ pi 



N ^k ^, 



Note that P(to|to) > j X^xgk -^(^^I^;)' ^"d by concavity of 
the square root, 

(to|to) > t X! ^PiA^)- 

xeK„, 

Likewise, the state of the eavesdropper for message m is 
i ^xeK Px^ '^^'^ concavity of the fidelity. 



\ xeK^ J xeK^ 



Fip^a^) 



I.e., this message will have error < e' and privacy deviation 
< 5' for these "good" to, where it is straightforward to work 
out that e' < e (1 + 75-) and 6' < S (l + ■^). In other words, 
by choosing -q = a ■ min(e^,5^) we can make the new error 
and privacy arbitrarily close to the original parameters. 



Now, we can find ifi, . . . , such that a fraction > 1 —p 
of the Km are "good", throw away the "bad" ones and we are 
left with the code we want: it has iV' > (1 -p)7V = ^ 
jj:M messages, if we choose k such that p < 1/2, which 
holds for k> M. 

— 2ri^ 

In summary, we can get a code with randomized encoding
and worst case error $\epsilon' \leq (1 + \alpha)\epsilon$, worst case privacy $\delta' \leq
(1 + \alpha)\delta$, and losing a constant amount of information. Indeed
the number of bits encoded diminishes by at most

21og- < 2 log - +41ogi +41ogi. 
7] a e 



By definition, every entanglement-generating code of error
$\epsilon'$ gives rise to a private classical (secret key generation) code
of error and privacy $\epsilon'$, and with $M = |C|$ messages. Thus,

$$M(n,\epsilon',\epsilon') \geq N_E(n,\epsilon') \geq N(n,\epsilon').$$

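Theorem 14 Let $\mathcal{N} : \mathcal{L}(A) \to \mathcal{L}(B)$ be a degradable channel
with finite quantum systems $A$ and $B$. Then, for error $\epsilon$ and
privacy $\delta$ such that $\epsilon + 2\delta < \frac{1}{\sqrt{2}}$ (e.g. $\epsilon = \delta < \frac{1}{3\sqrt{2}} \approx 0.2357$),
and every integer $n$,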



log M{n,e,S) < nQ^^\j\f) + pJ n\n 



3|Aplogn + 9 + lllog-, 



< nQ(i)(7V) + O (V^logn) , 



■n 



where "H — \ — e — 2(5 

Together with the direct part (achievability proved in |l9l , 
ifTSl ) we thus get: 

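Corollary 15 For a degradable channel $\mathcal{N}$, the private capacity is given by

$$P(\mathcal{N}) = \lim_{n \to \infty} \frac{1}{n} \log M(n, \epsilon, \delta),$$

for any $\epsilon, \delta > 0$ such that $\epsilon + 2\delta < \frac{1}{\sqrt{2}}$.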

Proof: Consider a code for $\mathcal{N}^{\otimes n}$ with $M = M(n, \epsilon, \delta)$
messages, that has error $\epsilon$ and is $\delta$-private: message $x$ (chosen
uniformly) is encoded as $\sigma_x \in \mathcal{S}(A^n)$ and sent through the
channel, giving rise to an averaged cqq-state between reference
$X$, output $B^n$ and environment $E^n$:

X 

The "trivial" converse shows that

$$\log M \leq H_{\min}^{\delta}(X|E^n) - H_{\max}^{\epsilon}(X|B^n),$$

cf. Renes and Renner [31], whose argument we briefly repeat
here since they used trace norm rather than purified distance.
According to the definition of privacy given above, the reduced
state $\rho^{XE^n}$ is within purified distance $\delta$ of a product state of
the form $\frac{1}{M} \sum_x |x\rangle\langle x|^X \otimes \widetilde{\rho}^{E^n}$; hence $H_{\min}^{\delta}(X|E^n) \geq \log M$.
Likewise, there exists a decoding cptp map $\mathcal{D} : \mathcal{L}(B^n) \to \widehat{X}$
such that $(\mathrm{id} \otimes \mathcal{D})\rho^{XB^n}$ is within $\epsilon$ purified distance from
the perfectly correlated state $\frac{1}{M} \sum_x |x\rangle\langle x|^X \otimes |x\rangle\langle x|^{\widehat{X}}$, hence

$$H_{\max}^{\epsilon}(X|B^n) \leq 0.$$

Now we can purify $\rho^{XB^nE^n}$ to a state $\varphi^{XX'A_0B^nE^n}$,
introducing a dummy system $A_0$ to hold the purifications
$\phi_x^{A_0A^n}$ of the signal states $\sigma_x$ and a coherent copy $X'$ of
$X$:



XX'B'^E" 



1 



to which we then also apply the Stinespring dilation of the 
degrading map: 



\XX'E"^F"E 



M 

X" 

With respect to tp, we thus have 

< Hl,^{F-\En - H^+l'+^^'{F-\E"'X) 

2 

+ 41og^, 



(18) 



where we have used the degradability property of the channel
in the second line, and in the third line the chain rule,
Lemma 7 in its two manifestations Eqs. (5) and (6). Indeed,

H^+^^\AB\C) < + ^max(SlAC) + log -I 



-31og^, 
T 

which we employ with the identifications F" = A, X = B, 
E'" = C, and with K = e + 3S. 

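Choosing $\eta = \frac{1}{6}\left(\frac{1}{\sqrt{2}} - \epsilon - 2\delta\right)$ ensures that $\epsilon' := \epsilon + 2\delta + 5\eta = \frac{1}{\sqrt{2}} - \eta < \frac{1}{\sqrt{2}}$, and we can bound the second term
on the right hand side of Eq. (18) as before, in the proof of
Theorem 2: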



77^,,(F"|£;"'X) < -H^^,^{F-\E"'X) + 21og 



277 



277 



= i/l,(F«|ii;"X'Ao) + 21og 

<i/l,(F"|£;"X')+21ogi- 

= i/l,(F"|ii;'"X) + 21og^, 

where we have used Lemma 8, then the duality between min-
and max-entropy, then the monotonicity (Lemma 6) and finally
the exchange symmetry between $X$ and $X'$ as well as between
$E$ and $E'$. As this means



K,,^iF"\E"'X)<log 



we have by plugging this into Eq. (18),

log Af (77, e, 6) < Hl,,{F^\E"') + 3 + 9 log i, 

?7 

and the rest of the argument is as in the proof of Theorem |2] 
[cf. Eq. O]: 

1 2 -|A|2 39r)l^l^ 

< Hsl " (F"|£;'")^(„, + log 

< max HSl (F"|£;'")p«„ 

+ 3|Ap log 77^ + 6 + l0g4T, 

invoking the quantum AEP for the max-entropy (Proposition 13). ■

VII. Strong Converse for Symmetric Channels Implies It for Degradable Channels

The main result of this section, Theorem 19, is valid
for degradable channels satisfying the following technical
condition.

Definition 16 We say that a degradable channel $\mathcal{N}$ is of type
I (for invariance) if one can choose a Stinespring dilation $U$
of it, and a Stinespring dilation $V$ of a degrading channel $\mathcal{M}$,
such that the unitary $X_F$ in Lemma 1 is a global phase (hence
$\pm 1$). I.e.,

$$(\mathbb{1}_F \otimes \mathrm{SWAP}_{EE'})VU = \pm VU.$$

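Example (Erasure channels). The qubit erasure channel

$$\mathcal{E}_q(\rho) = (1 - q)\rho \oplus q\,|{*}\rangle\langle{*}|$$

with erasure probability $q \leq \frac{1}{2}$ has as its complementary
channel $\mathcal{E}_q^c = \mathcal{E}_{1-q}$; as degrading map serves $\mathcal{E}_t$, with $t = \frac{1-2q}{1-q}$
(augmented by the identity on $|{*}\rangle\langle{*}|$).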
We can guess an isometric dilation of £q, 

and likewise for the degrading map, 

V:M 



\E' 



277' 



With the choice of phase $e^{i\alpha} = 1$, it is straightforward to
verify that $\mathrm{SWAP}_{EE'}VU = VU$.

However, since the output of an erasure channel has no
coherences between the erasure symbol and the unerased part,
there is considerable freedom in choosing the dilations both
of the channel and of the degrading map. For some of them
there is no unitary $X_F$ as in Lemma 1; for some the unitary
is non-trivial. Indeed, we can see this by varying $\alpha$ in the
dilation $U$ above, most choices of which leave no symmetry
$X_F$, but for $e^{i\alpha} = -1$ we can choose $X_F = 2|{*}\rangle\langle{*}| - \mathbb{1}$. ■
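The degradability of the erasure channel can be checked numerically at the level of channels (not dilations). The sketch below assumes the standard composition rule for erasure channels, from which the degrading parameter $t = (1-2q)/(1-q)$ follows by solving $(1-q)(1-t) = q$. The output space is modelled as a 3-dimensional space (qubit block plus the flag $|{*}\rangle$), and all function names are hypothetical.

```python
import numpy as np


def erasure(rho, q):
    """Qubit erasure channel: (1-q)*rho on the qubit block, q*Tr(rho) on the flag |*>."""
    out = np.zeros((3, 3), dtype=complex)
    out[:2, :2] = (1 - q) * rho
    out[2, 2] = q * np.trace(rho)
    return out


def degrade(sigma, t):
    """Erase the qubit block with probability t and leave the flag |*> untouched."""
    out = np.zeros((3, 3), dtype=complex)
    qubit = sigma[:2, :2]
    out[:2, :2] = (1 - t) * qubit
    out[2, 2] = t * np.trace(qubit) + sigma[2, 2]
    return out


def degrading_parameter(q):
    """Solve (1 - q) * (1 - t) = q for t; meaningful for q <= 1/2."""
    return (1 - 2 * q) / (1 - q)


# Random qubit density matrix as a test input.
rng = np.random.default_rng(0)
g = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
rho = g @ g.conj().T
rho /= np.trace(rho)

q = 0.3
lhs = degrade(erasure(rho, q), degrading_parameter(q))  # degrading map after the channel
rhs = erasure(rho, 1 - q)                               # complementary channel
print(np.allclose(lhs, rhs))
```

This only confirms the channel-level identity; the question of which dilations make the channel type I, discussed in the example, is not addressed by the check.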

Example (Schur multiplier channels). Given a positive
semidefinite $n \times n$-matrix $S \geq 0$ with diagonal entries
$S_{ii} = 1$ one can define a cptp map $\mathcal{N}_S$ on $n \times n$-matrices
by Schur/Hadamard multiplication of the input $\rho$ by $S$:

$$\mathcal{N}_S : \rho \mapsto \rho \circ S, \quad \text{i.e. } \mathcal{N}_S(|i\rangle\langle j|) = S_{ij}\,|i\rangle\langle j|.$$

It is well-known that $S$ can be viewed as Gram matrix of
unit vectors $|\varphi_1\rangle, \ldots, |\varphi_n\rangle$:

$$S_{ij} = \langle\varphi_j|\varphi_i\rangle,$$

suggesting a Stinespring dilation

$$U : |i\rangle^{A'} \mapsto |i\rangle^{B}|\varphi_i\rangle^{E}.$$

It gives rise to the complementary channel

$$\mathcal{N}_S^c(\rho) = \sum_i \langle i|\rho|i\rangle\, |\varphi_i\rangle\langle\varphi_i|,$$

so we can choose $\mathcal{N}_S^c$ itself as degrading map and essentially
$U$ as its dilation $V$ (with $F$ taking the place of $B$, and $E'$ that
of $E$).
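Thus,

$$VU : |i\rangle^{A'} \mapsto |i\rangle^{F}|\varphi_i\rangle^{E}|\varphi_i\rangle^{E'},$$

which is evidently invariant under $\mathrm{SWAP}_{EE'}$ since the output
state restricted to $EE'$, $\mathrm{Tr}_F\, VU\rho U^\dagger V^\dagger$, is supported on the
symmetric subspace of $E \otimes E'$. ■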
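A small numerical sketch of the Schur multiplier example follows. It builds random unit vectors $|\varphi_i\rangle$, forms the Gram matrix $S$, and checks that the dilation $|i\rangle \mapsto |i\rangle|\varphi_i\rangle$ reproduces Hadamard multiplication by $S$, that the complementary channel is $\rho \mapsto \sum_i \langle i|\rho|i\rangle |\varphi_i\rangle\langle\varphi_i|$, and that $VU$ is invariant under swapping $E$ and $E'$. The dimensions and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d_env = 3, 4  # input and environment dimensions (illustrative values)

# Unit vectors |phi_i> and their Gram matrix S_ij = <phi_j|phi_i>.
phi = rng.normal(size=(n, d_env)) + 1j * rng.normal(size=(n, d_env))
phi /= np.linalg.norm(phi, axis=1, keepdims=True)
S = np.array([[np.vdot(phi[j], phi[i]) for j in range(n)] for i in range(n)])

# Stinespring isometry U : |i> -> |i>^B (x) |phi_i>^E as an (n*d_env) x n matrix.
U = np.zeros((n * d_env, n), dtype=complex)
for i in range(n):
    U[:, i] = np.kron(np.eye(n)[i], phi[i])

# Random input density matrix.
g = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
rho = g @ g.conj().T
rho /= np.trace(rho)

# Joint output on B (x) E with indices (b, e, b', e').
joint = (U @ rho @ U.conj().T).reshape(n, d_env, n, d_env)
channel_out = np.einsum("iaja->ij", joint)        # trace out E
complementary_out = np.einsum("iaib->ab", joint)  # trace out B

expected_complementary = sum(
    rho[i, i] * np.outer(phi[i], phi[i].conj()) for i in range(n)
)

# VU|i> = |i>^F |phi_i>^E |phi_i>^{E'}; swap the E and E' factors of every column.
VU = np.stack(
    [np.kron(np.eye(n)[i], np.kron(phi[i], phi[i])) for i in range(n)], axis=1
)
VU_swapped = (
    VU.reshape(n, d_env, d_env, n).transpose(0, 2, 1, 3).reshape(n * d_env ** 2, n)
)

print(np.allclose(channel_out, S * rho), np.allclose(VU_swapped, VU))
```

The swap invariance holds column by column here because each column carries two copies of the same vector $|\varphi_i\rangle$ on $E$ and $E'$.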

Remark We do not know whether all degradable channels
are of type I, not having found a counterexample so far. From
the examples given above it is clear however that the dilations
$U$ and $V$ required for a proof that a given channel is type
I have to be constructed carefully. The next lemma shows
that for any degradable channel we can construct one that is
information theoretically equivalent, and which is of type I. ■

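Lemma 17 For every degradable channel $\mathcal{N} : \mathcal{L}(A') \to \mathcal{L}(B)$, the channel

$$\widetilde{\mathcal{N}} = \mathcal{N} \otimes \tau : \mathcal{L}(A') \to \mathcal{L}(B \otimes B_0), \quad \rho \mapsto \mathcal{N}(\rho) \otimes \tau^{B_0},$$

which attaches to the output of $\mathcal{N}$ a qubit system $B_0$ in the
maximally mixed state, is degradable of type I.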

Proof: Clearly, $\widetilde{\mathcal{N}}^c = \mathcal{N}^c \otimes \tau^{E_0}$, with a qubit system $E_0$,
so the new channel is also degradable.

Choose a Stinespring isometry $U$ of $\mathcal{N}$ and $V$ of the
degrading map $\mathcal{M}$ according to Lemma 1, so that we have
a unitary involution $X_F$ with

$$(X_F \otimes \mathrm{SWAP}_{EE'})VU = VU.$$

$X_F$ can have only the two eigenvalues $\pm 1$, so decompose
$F = F_+ \oplus F_-$ into the respective eigenspaces with projectors
$P_+$ and $P_-$, respectively. Of course also $\mathrm{SWAP}_{EE'}$ has eigenvalues $\pm 1$, the corresponding eigenspaces being known as
symmetric and anti-symmetric subspace, denoted as $\mathrm{Sym}^2(E)$
and $\Lambda^2(E)$, respectively.

The above invariance of $VU$ under left multiplication by
$X_F \otimes \mathrm{SWAP}_{EE'}$ is equivalently expressed by saying that $VU$
maps $A'$ into the $+1$-eigenspace of $X_F \otimes \mathrm{SWAP}_{EE'}$, which
is

$$F_+ \otimes \mathrm{Sym}^2(E) \oplus F_- \otimes \Lambda^2(E).$$

In this picture we see why $X_F$ is necessary: it is there
to undo a possible phase of $-1$ induced by $\mathrm{SWAP}_{EE'}$ (on
$\Lambda^2(E)$), by applying the same phase once more on $F_-$. We
can also see how to write down dilations of $\widetilde{\mathcal{N}}$ and a degrading
map that avoid this problem: First, $U : A' \hookrightarrow (B \otimes B_0) \otimes (E \otimes E_0)$ with



U\4') := 



BE^.fm + m 



V V2 



Bo Eq 



is a dilation of Af . Secondly, we define a degrading map by 
writing down directly an isometric dilation V : B ® Bq 

F (g) {E' (g> E'o): 



Vi\ip)^\b)^'>) 



C-Z 



F^E' 



°)i{vmb)), 



where 



C-Z^^^o = P+ (g) 1b„ + P_ ® Zb„ 



is a controlled-Z using the $F_\pm$ subspaces to trigger a $Z$ on the
qubit $E_0$ (which we identify with $B_0$).

It is easy to check that $\mathrm{Tr}_{E'E'_0}\, V \cdot V^\dagger$ defines a bona fide
degrading map for $\widetilde{\mathcal{N}}$. But it is also of type I, as it can be
confirmed by direct calculation that



VU\cl,) = (P+ 



(P- 



^EE 



)VU\(j)) ® 



|01) + |10)\^''^° 



EE' 



)VU\(I)) ® 



V2 

|oi) - |l o)^^^°•^° 



Since the left hand factor in the first line is in $F \otimes \mathrm{Sym}^2(E)$,
while the analogous term in the second line is in $F \otimes \Lambda^2(E)$,
the entire expression lies in $F \otimes \mathrm{Sym}^2(EE_0)$, hence under the
simultaneous swap $EE_0 \leftrightarrow E'E'_0$,

$$\mathrm{SWAP}_{EE_0:E'E'_0}VU = VU,$$

and we are done. ■
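The sign cancellation behind this construction can be illustrated numerically: an anti-symmetric vector on $E \otimes E'$ tensored with the anti-symmetric qubit state on $E_0 \otimes E'_0$ is invariant under the simultaneous swap, as is a symmetric vector tensored with a symmetric qubit state. The sketch below uses a random vector and the states $(|01\rangle \mp |10\rangle)/\sqrt{2}$; the dimension `d` and all names are illustrative assumptions, and the system $F$ is omitted since the swap acts trivially on it.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 3  # dimension of E and of E' (illustrative)


def swap_two(v, dim):
    """Swap the two tensor factors of a vector on C^dim (x) C^dim."""
    return v.reshape(dim, dim).T.reshape(-1)


def simultaneous_swap(v):
    """Swap E <-> E' and E0 <-> E0' for a vector ordered as (E, E', E0, E0')."""
    return v.reshape(d, d, 2, 2).transpose(1, 0, 3, 2).reshape(-1)


# Random symmetric and anti-symmetric vectors on E (x) E'.
m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
sym = (m + m.T).reshape(-1)
anti = (m - m.T).reshape(-1)
sym /= np.linalg.norm(sym)
anti /= np.linalg.norm(anti)

# Symmetric and anti-symmetric qubit states on E0 (x) E0'.
triplet = np.array([0, 1, 1, 0]) / np.sqrt(2)
singlet = np.array([0, 1, -1, 0]) / np.sqrt(2)

plus_part = np.kron(sym, triplet)    # Sym (x) Sym
minus_part = np.kron(anti, singlet)  # Lambda (x) Lambda: the two signs cancel

print(np.allclose(simultaneous_swap(plus_part), plus_part),
      np.allclose(simultaneous_swap(minus_part), minus_part))
```

Any superposition of such vectors therefore lies in the symmetric subspace of the pair $EE_0$ with $E'E'_0$, which is the invariance claimed in the proof.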

Degradable channels of type I are intimately related to 
symmetric channels, as shown in the next lemma. 

Lemma 18 Let $\mathcal{N}$ be a degradable channel of type I, and
choose a Stinespring dilation $U$ as well as a dilation $V$ of a
degrading map, according to Lemma 1, s.t. $X_F = \pm 1$.

For any test state $|\phi_0\rangle \in AA'$ of maximal Schmidt rank,
let $|\psi_0\rangle^{AFEE'} = VU|\phi_0\rangle^{AA'}$ and denote the supporting
subspace of $\psi_0^{AF}$ by $G$.

Then there is a symmetric channel $\mathcal{M}$ with Stinespring
isometry $W : G' \hookrightarrow E \otimes E'$ (i.e. $\mathrm{SWAP}_{EE'}W = \pm W$) such
that every state $|\psi\rangle^{AFEE'} = VU|\phi\rangle$, for $|\phi\rangle \in AA'$, can be
written as $W|\xi\rangle \in GEE'$ for a suitable test state $|\xi\rangle \in GG'$,
up to a (state-dependent) isometry $\widehat{W} : G \hookrightarrow AF$:

$$|\psi\rangle^{AFEE'} = (\widehat{W} \otimes W)|\xi\rangle^{GG'}.$$


Proof: By definition, |?/'o)^^^^' G G g) P g) P', so we 
may denote it as well \'4)q)'^^^ . Choose a purification Ix)*^*^ 
with G' ~ G, so that there exists an isometiy W : G' ^ EE' 
with 



it(gW)\x) 



GG' 



\GEE' 



It is easy to see that $W$ has the required symmetry property:
since $\mathrm{SWAP}_{EE'}|\psi_0\rangle^{GEE'} = \pm|\psi_0\rangle^{GEE'}$, it follows that $(\mathbb{1} \otimes \mathrm{SWAP}_{EE'}W)|\chi\rangle^{GG'} = \pm(\mathbb{1} \otimes W)|\chi\rangle^{GG'}$, and since $|\chi\rangle$ has
maximal Schmidt rank, $\mathrm{SWAP}_{EE'}W = \pm W$ follows.

Now, let {(p)"^^ be an arbitrary input test state and 

\^)AFEE' ^ VU\(f)). Then, 



\AA' 



1 I0( 



and thus 

\„L\AFEE' 



AFEE' 



GG' 



Finally, since x*^ = has support on G, there exists a 

a e 5(G) and an isometry W : G ^ AF such that 



GG' 



= : (T?(8) ]l'^')|0'^'^'• 



In total, = (W ® W)\C)^^' , which is what we 

wanted to prove. ■ 



Theorem 19 Let $\mathcal{N} : \mathcal{L}(A) \to \mathcal{L}(B)$ be a degradable channel, which w.l.o.g. we assume to be of type I (by Lemma 17).
Denote its environment by $E$ and the associated symmetric
channel by $\mathcal{M}$, with Stinespring dilation $W : G \hookrightarrow E \otimes E'$
from Lemma 18. Then $\mathcal{N}$ obeys the strong converse for its
quantum capacity, if $\mathcal{M}$ does (note that by the no-cloning
argument, $Q(\mathcal{M}) = 0$). More precisely, there exists a constant
$\mu$ such that



log NE{n,e\N) < 7iQ^^\N) + pL\ n\xY 



+ 81og- + 0(logn) 
A 

+ \ogNE{n,l-\\M), 



with $\lambda = \frac{1-\epsilon}{5}$.



Proof: We follow the initial steps of the proof of Theo- 
rem |2] until the bound 

\ogNE{n,e) < Hi,^{F-\E"')-H^+^^\AF-\E"')+\og^, 



where all entropies are with respect to the state 



\AE"-F"E' 



Now we choose $\lambda = \frac{1-\epsilon}{5}$.



The first term is treated in the exact same way as we did 
there, giving 



1 \2„-|A|- 



1 \ 



Hi^,^{F-\E"') < max iJnfax" 

p(iS(A) 

+ 3|ylplogn + 6 + log 



A2 



1 



+ 3|Anogn + 6 + log— , 
A^ 

where we have used the quantum AEP (Proposition 13) once
more.

The second term can be upper bounded 

-H^+l\AF-\E"') = H'^^\AF-\E-)^ 



^min \'~' \^ !{t®W)1>'^\e,) 



1 



< log7V£;(n,e + 4A|X) + 41og- 

A 

using duality in the first equation and Lemma 18 in the second,
to rewrite the state $\psi$ (up to an isometry $G^n \hookrightarrow AF^n$)
as if a test state $|\xi\rangle^{G^nG'^n}$ had gone through $W^{\otimes n}$. The
inequality in the third line is by Proposition 20 below.

Putting these bounds together yields the statement of the 
theorem. ■ 

The following result is essentially a version of the one-shot
decoupling proof of entanglement-distillation and random
quantum coding, adapted so that the error is composed of a
smoothing and a random coding component; its proof can
be found in the appendix. Note that it gives an essentially
matching lower bound to the upper bound we used in the proof
of Theorem 2. It allows us to assess one of the max-entropy
terms we encountered there in a new light.

Proposition 20 (Cf. Buscemi/Datta [8] & Datta/Hsieh [12])

Let $U : A' \hookrightarrow B \otimes E$ be the Stinespring dilation of
a quantum channel $\mathcal{N}$ and $|\phi\rangle \in AA'$ a state vector,
$|\psi\rangle := (\mathbb{1} \otimes U)|\phi\rangle \in ABE$. Then, given $\eta \geq 0$ and $\epsilon > 0$,
there exists an entanglement-generating code for $\mathcal{N}$, creating
a maximally entangled state of rank $d$ with error $\leq \eta + \epsilon$,
where

r 



d = 



41og- 
e 



Remark We gave the very precise form of the bounds above
to emphasize that if the strong converse holds in its exponential
form for $\mathcal{M}$, in the sense that for every error rate $c > 0$,

$$\limsup_{n \to \infty} \frac{1}{n} \log N_E(n, 1 - 2^{-cn}|\mathcal{M}) \leq f(c),$$

with some non-decreasing continuous function $f(c)$ of $c$ such
that $f(0) = 0$, then there exists a similar function $g(c)$ such
that for $\mathcal{N}$,

$$\limsup_{n \to \infty} \frac{1}{n} \log N_E(n, 1 - 2^{-cn}|\mathcal{N}) \leq Q^{(1)}(\mathcal{N}) + g(c).$$

In other words, if the error of $\mathcal{M}$ converges to 1 exponentially for positive rates, then the error of $\mathcal{N}$ converges to 1
exponentially for rates exceeding $Q^{(1)}(\mathcal{N})$. ■

Remark The type I channel constructed in the proof of
Lemma 17 is such that the composition $VU$ of the Stinespring
dilations of channel and degrading channel actually maps
the input space $A'$ isometrically into $F \otimes \mathrm{Sym}^2(E) \subset F \otimes E \otimes E'$, so that $X_F = \mathbb{1}$.

Looking at Lemma 18 we see that the symmetric channel
constructed there has a dilation $W : G \hookrightarrow \mathrm{Sym}^2(E) \subset E \otimes E'$,
which is a restriction at the input of the "universal" symmetric
channel $\mathcal{S} : \mathcal{L}(\mathrm{Sym}^2(E)) \to \mathcal{L}(E)$ with the trivial Stinespring
dilation

$$\mathrm{Sym}^2(E) \hookrightarrow E \otimes E'.$$

To prove a full strong converse for all degradable channels,
by Theorem 19 it is thus enough to show the strong converse
for the channels $\mathcal{S}$, for arbitrarily large dimension $|E|$. More
precisely, $|E| = 2|A||B|$ is enough for all degradable channels
with given input and output spaces $A$ and $B$. ■

VIII. A SEMIDEFINITE PROGRAMMING APPROACH TO THE 
MIN-ENTROPY OF MULTIPLY SYMMETRIC STATES 

In the proof of Theorem 2, we came across a term
$-H_{\max}^{\epsilon'}(AF^n|E'^n)$, $\epsilon'$ being larger than the coding error we
want to analyze. Similarly, in the proof of Theorem 14 we had
$-H_{\max}^{\epsilon'}(F^n|E'^nX)$.

In both cases, assuming w.l.o.g. that the channel $\mathcal{N}$ is
of type I (Lemma 17) and using Lemma 18, we may view
both expressions as $-H_{\max}^{\epsilon'}(G^n|E'^n) = H_{\min}^{\epsilon'}(G^n|E^n)$, with
respect to an input-output joint state of a symmetric channel
$\mathcal{M}^{\otimes n}$. Lemma 18 also informs us that $\mathcal{M}$ (or a trivial
modification of $\mathcal{M}$) has a Stinespring dilation $W : G \hookrightarrow \mathrm{Sym}^2(E) \subset E \otimes E'$; in fact, w.l.o.g. $G = \mathrm{Sym}^2(E)$ but we
will not use this.

Now, in the proofs of Theorems 2 and 14 we only made use
of the fact that $\mathcal{M}^{\otimes n}$ is symmetric with respect to exchanging
the entire output with the entire environment system. This
symmetry was enough to show that for $\epsilon' < \frac{1}{\sqrt{2}}$ this term
can be bounded by a constant; we also remarked that for larger
$\epsilon'$ this kind of argument cannot be applied.

However, it is obvious that the channel has much more
structure, which we ought to exploit. Indeed, it is symmetric with respect to exchanging the output and environment
systems of any subset of the $n$ instances of $\mathcal{M}$ while leaving
the others in place, i.e. for any $I \subset [n]$,

and so the joint state of input, output and environment, 

\^)G"E'-E'" ^ ^ VK)^"|0)^"'^'", satisfies similarly 

,„ (19) 
= V^^ ^ SWAP|^„ 

for all subsets /. 

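The semidefinite programming (SDP) formulation for the
smoothed min-entropy is given by (cf. [48])

$$\begin{aligned}
2^{-H_{\min}^{\epsilon'}(G^n|E^n)} = \min\ & \mathrm{Tr}\,\sigma^{E^n} \quad \text{s.t.} \\
& \rho^{G^nE^nE'^n} \geq 0, \quad \mathrm{Tr}\,\rho \leq 1, \\
& \mathrm{Tr}\,\rho\psi \geq 1 - \epsilon'^2 =: \delta, \\
& \rho^{G^nE^n} \leq \mathbb{1}^{G^n} \otimes \sigma^{E^n}.
\end{aligned}$$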



By duality theory (cf. PHI ) this value is equal to the dual SDP, 
given by 

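$$\begin{aligned}
\max\ & \delta r - s \quad \text{s.t.} \\
& r, s \geq 0, \quad X \geq 0, \\
& r\,\psi^{G^nE^nE'^n} \leq X^{G^nE^n} \otimes \mathbb{1}^{E'^n} + s\,\mathbb{1}, \\
& \mathrm{Tr}_{G^n} X \leq \mathbb{1}^{E^n}.
\end{aligned}$$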
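The form of the dual can be recovered by a standard Lagrangian argument. The following is a sketch of that textbook derivation, not taken from the source; it assumes the primal constraints are $\rho \geq 0$, $\sigma \geq 0$, $\mathrm{Tr}\,\rho \leq 1$, $\mathrm{Tr}\,\rho\psi \geq \delta$ and $\rho^{G^nE^n} \leq \mathbb{1}^{G^n} \otimes \sigma^{E^n}$, and it attaches multipliers $s \geq 0$, $r \geq 0$ and $X \geq 0$ to the last three.

```latex
\begin{align*}
L(\rho,\sigma;r,s,X)
  &= \mathrm{Tr}\,\sigma + s\,(\mathrm{Tr}\,\rho - 1) + r\,(\delta - \mathrm{Tr}\,\rho\psi)
     + \mathrm{Tr}\,X\bigl(\rho^{G^nE^n} - \mathbb{1}^{G^n}\otimes\sigma^{E^n}\bigr) \\
  &= \delta r - s
     + \mathrm{Tr}\,\sigma\bigl(\mathbb{1}^{E^n} - \mathrm{Tr}_{G^n}X\bigr)
     + \mathrm{Tr}\,\rho\bigl(X\otimes\mathbb{1}^{E'^n} + s\,\mathbb{1} - r\,\psi\bigr).
\end{align*}
```

For every primal feasible pair $(\rho, \sigma)$ the Lagrangian is at most $\mathrm{Tr}\,\sigma$. Its infimum over $\rho, \sigma \geq 0$ equals $\delta r - s$ when both bracketed operators are positive semidefinite and is $-\infty$ otherwise, so every triple $(r, s, X)$ satisfying those two operator inequalities gives $\delta r - s \leq 2^{-H_{\min}^{\epsilon'}(G^n|E^n)}$.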

Note that we get an upper bound on $H_{\min}^{\epsilon'}(G^n|E^n)$ from
every dual feasible point (a triple r, s, X). The problem is 
to construct such a dual feasible point for each pure state 
ipG E E ^jjjj jjjg symmetries ([T9] l and each S > 0, such 
that Sr — s > 2"^^^'. Since so far we were unable to find 
such a construction, we leave the problem at this point to the 
attention of the reader. 

IX. Conclusion 

For degradable quantum channels, whose quantum and 
private capacities are known to be given by the single-letter 
maximization of the coherent information (which is then also 
additive on the class of all degradable channels), we have 
shown how to use the powerful min- and max-entropy calculus 
to derive bounds on the optimal quantum and private classical 
rate, for every finite blocklength n. These bounds improve on 
the well-known weak converse in that they give asymptotically 
the capacity as soon as the error (parametrized by the purified 
distance) is small enough: for Q this was the error of 
a 50%-50% erasure channel, for P we could get Since 
this says equivalently that the minimum attainable error jumps 
from 0 to at least some threshold as the coding rate increases
above the capacity, we speak of a "pretty strong" converse 
(halfway between a weak and a proper strong converse). 

We have shown furthermore that it is enough to prove a 
strong converse for certain universal symmetric (degradable 
and anti-degradable) channels, namely those whose Stinespring
dilation is the embedding of $\mathrm{Sym}^2(E)$ into $E \otimes E'$
as a subspace; then the strong converse would follow for all
degradable channels. To deal with these symmetric channels, 
and more generally with states exhibiting n-fold exchange 
symmetry between output and environment systems, we dis- 
cussed briefly a semidefinite programming (SDP) approach. 
The viability of this approach stems from the fact that bound- 
ing the relevant min-entropy can be cast as a dual SDP, 
and so upper bounds may be obtained by any single dual 
feasible point. We have not been able to carry this part of 
the programme through yet. 

Note that the proofs use the quantum AEP, but this does 
not mean that these results are restricted to i.i.d. channels. 
In fact, by using a standard discretization argument one 
can prove that for an arbitrary non-stationary memoryless 
channel $\mathcal{N}_1 \otimes \cdots \otimes \mathcal{N}_n$, where each $\mathcal{N}_t : \mathcal{L}(A) \to \mathcal{L}(B)$ is
degradable, and sufficiently small error, the obviously defined 
logA^(n, e), logiV£;(n, e) and logM(n, e, 5) are asymptoti- 
cally ElliQ^^HM) ± o{n) — cf. fTl and HI for analo- 
gous statements for classical and classical-quantum channels, 
respectively. 
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Most channels of course are not degradable (or anti- 
degradable). For practically all these others we do not have 
any approach to obtain a strong or even just a pretty strong 
converse. One might speculate that other channels with addi- 
tive coherent information, hence with a single-letter capacity 
formula, are also amenable to our method. But already the very 
attractive-looking class of conjugate degradable channels [7]
poses new difficulties. 

A related but different question is whether the symmetric 
side channel-assisted quantum capacity $Q_{ss}(\mathcal{N})$ [41], which
has an additive single-letter formula, obeys a pretty strong 
converse. Note that since arbitrary symmetric side-channels 
are permitted, including arbitrarily large 50%-50% erasure 
channels, the strong converse cannot hold for this capacity, 
since even infinite rate is achievable with error Our present 
techniques, requiring bounds on the various system dimensions
of the channel, do not apply, and we seem to need new ideas.

Note on related work. In [37], Sharma and Warsi show
that one may formulate upper bounds on the fidelity of codes 
in terms of the rate and so-called generalized divergences. 
Their approach doesn't appear to be related to ours, but it 
is conceivable that it may lead to proofs of strong converses 
for certain channels' quantum capacity. This however seems 
to presuppose that channel parameters derived from these 
divergences have strong additivity properties, which can only 
hold for channels with additive coherent information. 

More precisely, the upper bound on the fidelity contained
in [37, Thm. 1] is of no direct use, much as the trivial first steps
in the proofs of our Theorems 2 and 14. The reason is that the
bound explicitly depends on the code, via the joint input-output 
state. The only hope at this point is to control the maximum 
of said bound over all such input-output states. It is natural to 
expect that a first step might be to show that the maximum is 
attained on product states, but even that would leave nontrivial 
work to do. Crucially, the nature of the maximum bound is not 
addressed in [37]. Instead it is shown for the quantum erasure
channel, that the bound, evaluated on the input-output state 
corresponding to maximally mixed input (which is indeed a 
tensor power), decreases exponentially. 

This is the meaning of [37, Thm. 3], as one can discover
from the calculation following its statement. Literally however, 
it says "The strong converse holds for the quantum erasure 
channel for the maximally entangled channel inputs", which 
might lead an unsuspecting reader to believe that indeed the 
strong converse is proved here, albeit perhaps with some re- 
striction that is left vague. The concluding paragraph unfortu- 
nately repeats this claim in the stronger words "To summarize 
our results, we have given an exponential upper bound on 
the reliability of quantum information transmission", and "We 
then apply our bound to yield the first known example for 
exponential decay of reliability at rates above the capacity for 
quantum information transmission". Nothing could be further 
from the truth; not a single instance of exponential decay of 
fidelity above the capacity has been shown within the approach 
of [37], because the dependence on $n$ of the maximum bound
in [37, Thm. 1] is not generally understood. Indeed, claims
such as the ones quoted above, would necessarily have to 



involve a bound on all conceivable quantum codes, for large 
n, which seems difficult, to say the least. But the only code 
that [37, Thm. 3] covers is the trivial one of using the entire
input bandwidth, not encoding at all. To analyze it, however,
one hardly needs the machinery developed in [37]; it is an
elementary exercise to show that every noisy channel exhibits 
exponential decay of fidelity for this code. 
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Appendix 

Here we present the proofs of several auxiliary results used 
in the proof of the main result, which would have broken the 
flow of the text. 

Proof of Lemma 10: Define the auxiliary state



-ABX 



til 



X 



SO that the average of the pi becomes p — Tij^p' 
Choosing purifications ipf^'-^, we can consider the following 
purification of J)^^^: 

\ABCXY 



Then, using monotonicity (Lemma 6) and duality,



>m 



lin 



(20) 



observing ip'^^^'' = J2iPi'4'f^ ^ KXiT- 

Now, by definition of the smooth min-entropy, its exponential
is given by the following optimization:



minTro-'^'^ s.t. 



CY 



p>0, Trp < 1, 



Since ip^'^'^ is invariant under phase unitaries on Y, we may 
assume w.l.o.g. that both p and a have the same property, 
i.e. they may be assumed to be classical on Y: 



CY 
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where qi > 0, J^i Qi = ^ ™d pi E S<:{AC); furthermore > 
0. With these notations, the objective function in the above 
optimization is Tra^^ — X^i 9*^1 crf^, the first constraint is 
equivalent to pf^ < 1^ <^ a'^ for all i, and 

i 

Thus, observing that the V'i*'^ ^6 related to V'l"^ = Tr s^/ii 
by local unitaries, we have 

$,(^^^^) =min^g,$,^(V,^^f^) s.t. 



i 

min^g.^e.lV-f^) s.t. 



where the variables are qi and e^. 

Now, the Cauchy-Schwarz inequality says



Hence the constraint implies that ^ . q,; -^Z 1 — > 1 — and 
we get 

> min^qi$e,(V'f'^) s.t. 



For each i, = Trcrf with < Pi < l'^ (g) erf, 

Ti Pi < 1, and F{tp^'~' ^pi) > y/l — e^. Thus, forming w :— 
J2i1iPi ^ 5(AC) and a = J^ili'^ — 0' '^^ have Trw < 1, 
w < 1 (g) CT and 



6^ =: v/T^, 



where ? < 

This gives eventually 

so going back to Eq. ( |20l i, we arrive at 

> -^min(^|C')V'l 
= -f^max(^|C')p 

and we are done. I 

Proof of Lemma [77} Fix purifications ipf'^'^ of the pi 
so that p can be purified as 



,ASC| -xCo 



We use the following characterization of smooth max- 
entropies (cf. B31 ): 

2J^max(^|B)p. =mm\\TTAZ^\\ s.t. 



Fix optimal G ABC, such that = > 



Vl — e^, and > 0. Let A = max^ ||Tr/iZi|| and define 

M 



i=l 



SO that 



Furthermore, using Hayashi's pinching inequality 
the second line, 

M 

iwi = Ev^i^^xv^^i®i*)oi 

ij=l 

M 
i=l 



in 



7AB 



Co 



=: <g) l^'^'". 

I.e., ^/;' and Z are feasible for p, and the objective function 
value 



\Tt aZ\\ = 



y^MpiTrAZj 



< MX 

gives an upper bound to 2^.'-x(^|s)p. xhus we can conclude 

H:^^^{A\B)p < log X + log M 

= maxiJ,;,^(A|B)p, +logM, 

as advertised. ■ 

Proof of Proposition 13: To get bounds valid for all
$n$, we use well-known tail estimates for sums of independent
random variables due to Hoeffding [14]. To wit, consider the
discrete random variable $X$ with minimum non-zero probability
$\min_x P_X(x) = 2^{-\mu}$, and let $L = L(X) = -\log P_X(X)$,
such that $0 \leq L \leq \mu$ with probability 1, and $\mathbb{E}L = H(P)$.
Then, for i.i.d. realizations $X_1, X_2, \ldots, X_n$ of $X$, and associated
$L_i$, Hoeffding's inequality states



Pr i ^ > nH{P) + A^/H \ < 



— ^4. 



. i=l 



(21) 



Pr i ^ < nF(P) - Ay/^ \ < e" 



. 4=1 
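As a sanity check on this kind of tail estimate, the sketch below compares the exact upper tail of $\sum_i L_i$ for a binary source with the standard Hoeffding bound for variables confined to $[0, \mu]$, namely $\Pr\{\sum_i L_i > nH(P) + \lambda\mu\sqrt{n}\} \leq e^{-2\lambda^2}$. This normalization is an assumption based on the textbook inequality, since the right-hand sides of the displayed bounds are not legible here; the function names are hypothetical and logarithms are taken to base 2.

```python
import math


def hoeffding_bound(lam):
    """Textbook Hoeffding tail bound for deviations of lam * mu * sqrt(n)."""
    return math.exp(-2 * lam ** 2)


def exact_upper_tail(p, n, lam):
    """Exact Pr[sum_i L_i > n*H + lam*mu*sqrt(n)] for X ~ Bernoulli(p), L = -log2 P(X)."""
    probs = (1 - p, p)
    mu = -math.log2(min(probs))                  # largest possible value of L
    entropy = -sum(q * math.log2(q) for q in probs)
    threshold = n * entropy + lam * mu * math.sqrt(n)
    tail = 0.0
    for k in range(n + 1):                       # k = number of outcomes equal to 1
        total = -k * math.log2(p) - (n - k) * math.log2(1 - p)
        if total > threshold:
            tail += math.comb(n, k) * p ** k * (1 - p) ** (n - k)
    return tail


if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, exact_upper_tail(0.25, n, 1.0), hoeffding_bound(1.0))
```

The exact tail is computed from the binomial distribution rather than by sampling, so the comparison is deterministic.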



We can use these bounds to construct typical projectors for 
a state p**", p G S{H), in the usual way. Let p = A;r|a;)(x| 
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be a diagonalization, so that A^. can be interpreted as a 
probability distribution on the x. Define two projectors 



3+A 



J2 k">(a;"| with 



r^+t := |x" = xi . . . a;„ : ^ - log A,. < nS{p) + A^/^j 



and 



J2 with 



By Eq. (EB, 



. .x„ : ^-logAj;, > nS{p) - A^/n^ 



where /i = log 1 1 "'^ 1 1 . 

Now, for a pure tripartite state \tp) e ABC, let A > and 
consider the projectors 



PZ :=P 



Pn 



Defining |*') := (1^ «) P^ «) P^ clearly we have 



> 1 - 2e 



for A = 1 / in - . By definition 



On the other hand, we just need to rescale P^ by its trace, 
(T :— r^^^p+ Pb to gst an eligible state in the definition of 

i/,^,i„(A|P). Note that TrP+ < 2"S(p)+Ambv^^ hence 

thus showing 



H:^,^{A\B) > nSiA\B) - {pB + ^ic)^n\n-. 

The upper bound on $H_{\max}^{\epsilon}(A|B)$ follows by the duality of the
min- and max-entropies, as well as that of the conditional von
Neumann entropy: $S(A|B) = -S(A|C)$. ■
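The duality $S(A|B) = -S(A|C)$ for pure tripartite states is easy to confirm numerically. The sketch below draws a random pure state on $A \otimes B \otimes C$ and compares the two conditional entropies; the dimensions and helper names are illustrative assumptions.

```python
import numpy as np


def entropy(rho):
    """Von Neumann entropy in bits, ignoring numerically zero eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log2(w)))


def reduced(psi, dims, keep):
    """Reduced density matrix of the pure state psi on the subsystems listed in keep."""
    t = psi.reshape(dims)
    drop = [i for i in range(len(dims)) if i not in keep]
    rho = np.tensordot(t, t.conj(), axes=(drop, drop))
    d = int(np.prod([dims[i] for i in keep]))
    return rho.reshape(d, d)


dims = (2, 3, 4)  # dimensions of A, B, C (illustrative)
rng = np.random.default_rng(3)
psi = rng.normal(size=np.prod(dims)) + 1j * rng.normal(size=np.prod(dims))
psi /= np.linalg.norm(psi)

# S(A|B) = S(AB) - S(B) and S(A|C) = S(AC) - S(C).
s_a_given_b = entropy(reduced(psi, dims, [0, 1])) - entropy(reduced(psi, dims, [1]))
s_a_given_c = entropy(reduced(psi, dims, [0, 2])) - entropy(reduced(psi, dims, [2]))
print(s_a_given_b, -s_a_given_c)
```

The identity follows from $S(AB) = S(C)$ and $S(B) = S(AC)$ for a pure state on $ABC$, which is what the two printed values reflect.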

Proof of Proposition 20: For a $d$-dimensional projector
$Q$ on $A$, write



\ABE 



where y^Tq is the normalisation of the left hand side and 
\iPq)^'^^ is a state. Our goal is to show that we can find 



Q such that ipQ^ is close to a product state. To be precise, 
the claim is that there exists ip G S<: (E) and Q such that 



(22) 



Then, using the familiar decoupling argument, there is a cptp
map $\mathcal{D}$ acting on $B$ such that

P{{id(g}V)ii^B,<S>QQ,) < 7^ + 2-3 (<i„(^l-E)v,-iogd)^ 

where $qq' is a maximally entangled state. Choosing 

I^q)^^' --^ ^AA\dtQ{Q ^ 

as the input state, so that $|\psi_Q\rangle^{ABE} = (\mathbb{1} \otimes U)|\phi_Q\rangle$, completes
the entanglement-generating code. Choosing $\log d \leq H_{\min}^{\eta}(A|E)_\psi - 4\log\frac{1}{\epsilon}$ guarantees that its error is $\leq \eta + \epsilon$.

To prove Eq. ( |22] |. choose a S S<{ABE) with P((p, V') < 
7? and -ffininl^l-^)!/- ^ Plmin{A\E)^. Consider the cptp map 



P:p. 



a 



where \Q) are orthogonal labels of a dummy system. By the 
contractiveness of the purified distance, we have 



P((P id)v3^^, (P (g) id)V'^^) < ry 



(23) 



We also have $\int dQ\, t_Q = 1$.
Now, Lemma 21 below tells us



noting 

(P (E) id){TA ®(p^)^ I dQTQ (g) (y9-^ (g) \Q){Q\, 



and that the trace norm on the left hand side is 



Q 



1^1 



By Eq. ([T]i, the trace norm bound implies 

P((P(gid)(^'^^,(P®id)(T^(g(^^)) < 2^3(^-"(^l^)--l°S'^). 

Substifiiting Hn-,iniA\E)^ = H'L^S^\E)^ and using 
Eq. ( |23T l with the triangle inequality for the purified distance, 
we get 

P((P (g id)V^^, (P g) id)(TA (g 

< ?7 + 2~3(^»in('^l^)*~'°s'^) -. 5. 

Equivalently, inserting the definition of ipQ and tg: 



Vl-S^ < F{{r g) id)V''^^, (P g) id)(TA (g 

dQ^Fi^^^,TQ(g>ip^). 
Since finally, by the concavity of the square root. 



J dQ^< J JdQtQ^ 1, 
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this implies that there exists Q in the previous integral with [23 
F{tpA^,TQ (g) ip^) > VT~P, which is precisely Eq. ■ 



Lemma 21 (Berta [5]) Let $|\psi\rangle \in ABC$ be a state vector.
Picking a $d$-dimensional projector $Q$ uniformly (i.e. from the
unitarily invariant measure $dQ$), we have



dQ 



1^1 



\Q G SiA) on the 



with the maximally mixed state tq = 
support of Q. 
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